Associative memory device with optimized occupation, particularly for the recognition of words

ABSTRACT

A memory device includes an associative memory for the storage of data belonging to a plurality of classes. The associative memory comprises a plurality of memory locations aligned along rows and columns for the storage of data along the rows. Each memory row comprises a plurality of groups of memory locations, each storing a respective datum, wherein groups of memory locations adjacent along one and the same row store data belonging to different classes. Groups of memory locations adjacent in the direction of the columns and disposed on different rows store data belonging to one and the same class. Each class comprises data having a different maximum length. The device is particularly suitable for the storage of words belonging to a dictionary for automatic recognition of words in a written text.

TECHNICAL FIELD

[0001] The invention relates to an associative memory device with optimized occupation, particularly for the recognition of words.

BACKGROUND OF THE INVENTION

[0002] As is known, for reading text, particularly handwritten text, various character recognition systems have been developed, based on text segmentation to separate the individual characters or portions thereof one from another, and on processing of the segments obtained to identify the characters. This procedure outputs a series of characters including spaces and punctuation marks.

[0003] Current systems are not, however, always capable of outputting correct data because of the presence of noise, the particular graphical characteristics of the text or the limited capacities of the recognition system. Consequently, further processing of the characters is necessary so as to guarantee the correctness of the sequence of characters and the extraction of meaningful words.

[0004] For these reasons, word recognition devices have been proposed which compare an input word to be recognized with a plurality of words belonging to a vocabulary, until a word in the vocabulary that is identical to, or nearest to, the word to be recognized is identified. The comparison procedure, carried out sequentially on the words in the vocabulary, does, however, require a considerable amount of time.

[0005] To solve this problem, in a patent application filed by the applicant on the same date, use is proposed of an associative memory, in which the word to be recognized is compared in parallel with all the stored words, enabling search times to be considerably reduced and hence permitting effective use in a word recognition device.

[0006] In currently used associative memories, data are stored by lines, that is, each line (each row for example) is intended for the storage of a single datum. For word recognition, however, given that the words belonging to a vocabulary have different lengths, there is the problem of optimizing memory contents so as to be able to store a sufficient number of words in a memory of modest size. Furthermore, the words must be easily searchable for comparison to the input word.

SUMMARY OF THE INVENTION

[0007] An object of the invention is to provide an associative memory, the organization of which is such as to improve optimization of memory occupation, improving extractability of stored data therefrom.

[0008] In general terms, the problem may be formulated as follows: given an associative memory of size M×N, containing M vectors of size N, given a database of Z vectors (where Z>M) of dimensions which are not constant but all less than N, the problem is that of organizing the associative memory in such a way as to store this database in the associative memory, optimizing its use.

[0009] In a first embodiment, the present invention includes a method having steps of parsing input data to provide a portion delimited by predetermined characteristics and determining a length of the portion. The method also includes steps of comparing the length to a table of lengths of data stored in an associative memory and providing the portion to a section of the associative memory containing data having lengths comparable to the portion. The method further includes steps of identifying a closest match between a datum stored in the section and the portion and storing an address from the associative memory corresponding to the datum in a second memory.

[0010] In a second embodiment, the invention includes a method having steps of parsing a dataset comprising sequences of analog values to determine lengths associated with each datum of the dataset, collecting data from the dataset having comparable lengths into groups and writing each group of the groups of data to a separate section of an associative analog memory.
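
The grouping step of the second embodiment can be illustrated with a minimal sketch. It assumes, purely for illustration, that each datum is a plain character string rather than a sequence of analog values, and that the classes are defined by a hypothetical ascending list of maximum lengths (`class_bounds`); neither name appears in the specification.

```python
from collections import defaultdict


def group_by_length(words, class_bounds):
    """Assign each word to the first class whose maximum length fits it.

    class_bounds is a hypothetical ascending list of maximum lengths,
    one per class. Words longer than the last bound are not stored.
    """
    groups = defaultdict(list)
    rejected = []
    for word in words:
        for bound in class_bounds:
            if len(word) <= bound:
                groups[bound].append(word)
                break
        else:
            # No class is long enough for this word.
            rejected.append(word)
    return dict(groups), rejected


groups, rejected = group_by_length(
    ["a", "to", "cat", "memory", "associative"], class_bounds=[3, 8]
)
```

Each resulting group would then be written to its own section of the memory; words that fit no class are simply left out, which mirrors the small fraction of unstored words mentioned in the description.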

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] For an understanding of the invention a preferred embodiment will now be described, purely by way of non-exhaustive example, with reference to the accompanying drawings in which:

[0012] FIG. 1 is a general block diagram of a word recognition device using the associative memory according to the invention;

[0013] FIG. 2 shows the structure of the associative memory of FIG. 1; and

[0014] FIGS. 3 and 4 show tables relating to the organization of the memory of FIG. 2.

DETAILED DESCRIPTION OF THE INVENTION

[0015] In FIG. 1, the associative memory, denoted by 10, forms part of a word recognition device 1. The device 1 is located downstream of an OCR or optical character recognition system (not shown) of known type.

[0016] The device 1 comprises a control unit 2, which coordinates the activities of the device, as described below, and has an input 3 at which it receives, from the OCR system, strings of characters on the basis of which the words are to be recognized; a data memory 4, storing data necessary for the control unit 2 and coupled thereto; a switch matrix 6, coupled to the control unit 2 and, through input lines 25, to the memory 10; a reference voltage generator 7, coupled to the switch matrix 6 via lines 8; a selection block 11; a priority code generation block 12, coupled to the output of the selection block; and a memory element 13, coupled to the output of the priority code generation block 12.

[0017] In detail, the control unit 2, which may be a microprocessor or other software processing unit, for example, determines the length of successive words, supplied by the character recognition system, on the basis of the length of the strings of characters not separated by spaces or punctuation marks. On the basis of the architecture of the memory and of the coding used for the characters, it also commands the switch matrix 6. For this purpose the data memory 4 stores data relating to the organization of the memory 10 (in the example, it stores a table 26 which supplies the correspondence between the length of the stored words and the columns of the memory 10 in which those words are stored, as described below); data relating to the coding used for the individual characters (i.e., it stores a table 27 which supplies the correspondence between each character of the alphabet and the associated relative weight, e.g., voltage level); and data relating to the generation of the weights (for example, it stores a table 28 which supplies the correspondence between each weight and the line 8 on which the relative voltage level is available). The voltage values corresponding to the weights of the different letters, according to this coding, are generated by the reference voltage generator 7 which may, for example, be provided as described in European patent application 96830498.0 filed on Sep. 30, 1996 in the name of this applicant. The switch matrix 6 may be of any acceptable type of the many known in the prior art, such as that described in European patent application 96830497.2 filed on Sep. 30, 1996 in the name of the applicant. Consequently, on the basis of the commands of the control unit 2, the switch matrix 6 is capable of coupling the lines 8 associated with the weights corresponding to the word to be recognized to at least some of the input lines 25 of the memory 10.
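
The word-length determination and character coding performed by the control unit can be sketched in software as follows. This is only an illustrative analogy: the `WEIGHTS` dictionary is a hypothetical stand-in for table 27 (the actual weights are voltage levels and are not given here), and the function names are assumptions.

```python
import re

# Hypothetical stand-in for table 27: character -> weight (e.g., a voltage level).
WEIGHTS = {ch: i + 1 for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz")}


def split_words(text):
    """Split a character string on anything that is not a letter
    (spaces, punctuation marks), dropping empty fragments."""
    return [w for w in re.split(r"[^a-z]+", text.lower()) if w]


def encode(word):
    """Map each character of a word to its associated weight."""
    return [WEIGHTS[ch] for ch in word]


words = split_words("The cat, the hat.")
lengths = [len(w) for w in words]
coded = encode("cab")
```

The length of each word is what would be looked up in table 26 to select the columns of the memory 10, while the coded sequence is what would be applied to the input lines 25.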

[0018] The hardware to implement the memory 10 comprises a memory of the associative type or content addressable memory of a type well known in the art. When it receives a datum formed by a sequence of elements at its input, it outputs a datum correlated to the address of the line (generally row) in which the datum which is closest to the input datum is stored. Preferably, the memory 10 is of the auto-associative type, i.e., it directly outputs the stored word closest to the input word. For example, the hardware to perform the memory function for the memory 10 may be of any acceptable type of the many known in the prior art, such as that described in the article by A. Kramer, M. Sabatini, R. Canegallo, M. Chinosi, P. L. Rolandi and P. Zabberoni entitled “Flash-Based Programmable Nonlinear Capacitor for Switched-Capacitor Implementations of Neural Networks” in IEDM Tech. Dig., pp. 17.6.1-17.6.4, December 1994. In particular, this memory is capable of automatically outputting a voltage value proportional to the Manhattan distance between the input datum and the datum stored in each row, as explained below.

[0019] In detail, FIG. 2 shows one example of the hardware for the memory array 10, which comprises M×N pairs of cells 15 (4000×64 pairs of cells for example), located in M rows 18 and N columns 19. Each pair of cells 15 defines a memory location and comprises a first cell 16 and a second cell 17. The drain and source terminals of all the cells 16, 17 disposed on one and the same row are coupled together to the inverting input of an operational amplifier 20 in charge integration configuration, having a non-inverting input coupled to earth and an output 21 coupled to the inverting input via a capacitor 22. A reset switch 23, controlled by the control unit 2 in a manner not illustrated, is located in parallel with the capacitor 22. The outputs 21 of the operational amplifiers 20 define the outputs of the memory array 10.

[0020] The gate terminals of the first cells 16 belonging to the same column are coupled to the same input line 25 of the memory whilst the gate terminals of the second cells 17 belonging to the same column are coupled to a respective different input 25. With this configuration, as described in detail in the above-mentioned article by Kramer et al., by storing a pre-determined voltage value in each pair of cells 15 and by supplying complementary voltage values Vg and V'g at the inputs 25 of the two cells 16, 17 of a pair 15, a voltage value is obtained at each output 21 of the array 10. This voltage is proportional to the Manhattan distance between the input datum and the datum stored in each row.
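
As a numerical illustration of what each row output represents, the following sketch computes the Manhattan distance between an input datum and every stored row. It is a software analogy only: the array produces these values in parallel as analog voltages, and the example weights below are arbitrary.

```python
def manhattan_rows(stored_rows, input_datum):
    """Return, for each stored row, the sum of absolute differences
    between its elements and the corresponding input elements."""
    return [
        sum(abs(a - b) for a, b in zip(row, input_datum))
        for row in stored_rows
    ]


# Arbitrary example weights, three elements per stored datum.
stored = [[3, 1, 20], [3, 15, 20], [8, 1, 20]]
distances = manhattan_rows(stored, [3, 1, 18])
```

In this example the first row is the closest, with a distance of 2, which corresponds to the lowest voltage at the outputs 21.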

[0021] The distance values present at the outputs 21 of the memory 10 are supplied to the selection block 11 for identification of the rows having shorter distance; the selection block 11 is of known type and described, for example, in “Winner-take-all-networks of O(n) complexity” by Lazzaro, S. Ryckenbusch, M. A. Mahowald and C. Mead in Tourestzky D. (ed), Advances in Neural Network Information Processing Systems 1. San Mateo, Calif.: Morgan Kauffmann Publisher, pp. 703-711 (1988). The addresses of the data at minimum distance (or the data themselves) are then supplied to the priority code generation block 12 which places them in a priority code, starting from the datum at minimum distance, and then to the memory element 13 (an EEPROM, ROM, RAM or other memory for example) for them to be stored.
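
The combined effect of the selection block 11 and the priority code generation block 12 can be approximated in software by ranking row addresses by distance. This is a sketch under stated assumptions: the parameter `k` (how many candidates are kept) and the tie-breaking by lower row address are illustrative choices, not taken from the specification, and the actual selection is performed by an analog winner-take-all circuit.

```python
def closest_rows(distances, k=3):
    """Return up to k row addresses ordered by increasing distance.

    Ties are broken by the lower row address (an assumption).
    """
    order = sorted(range(len(distances)), key=lambda r: (distances[r], r))
    return order[:k]


ranked = closest_rows([2, 16, 7, 2], k=3)
```

The resulting list plays the role of the prioritized addresses that would be stored in the memory element 13.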

[0022] The word recognition device 1 of FIG. 1 operates as follows. Initially a dictionary I is selected, i.e., a base of meaningful words in a certain language. A coding of the dictionary is then defined in such a way as to show the characteristics of the language in a readily computable way. As indicated, the coding takes place by associating an appropriate voltage value (weight) to each character of the alphabet. The dictionary is then inserted into the memory 10 using the coding stored in the table 27, storing several words in each row of the memory, as described below.

[0023] Subsequently, the sequence of characters belonging to a word to be recognized is input into the memory 10, using the same coding of the characters used to store the dictionary. Specifically, on the basis of the coding table 27 stored in the data memory 4, the control unit 2 provides commands to the switch matrix 6 so that the matrix 6 supplies the corresponding pairs of voltage values which are complementary to each other and generated by the reference voltage generator 7 to the input lines 25 of the associative memory 10.

[0024] The memory 10 then calculates the distance between the word to be recognized and each of the words stored in the memory 10 or in the desired portion thereof, i.e., calculates the sum of the distances between the weights corresponding to the characters forming the word to be recognized and the weights corresponding to the characters forming the words stored by it in the addressed portions of rows. In particular, if we call the coding of a single element (character) of a stored word a_i and the coding of the corresponding element (character) of the word to be recognized b_i, the memory 10 calculates the distance dist defined as:

$\mathrm{dist} = \sum_{i=1}^{L} \theta\left(a_i, b_i\right)$

[0025] in which L is the length of the word to be recognized and θ represents the generic distance calculation function.

[0026] On the basis of this distance, as described above, the blocks 11-13 are capable of showing and storing the addresses of the rows of the associative memory 10 relating to the words which are closest to the input word or directly storing the words.

[0027] To optimize occupation of the associative memory 10 in view of the presence of words of variable length, according to the invention it is proposed to organize the memory by dividing it into sub-groups (groups of columns or of rows) which are selectively addressable by the control unit 2 through the switch matrix 6, and then to carry out a dedicated search which considers only the words linked to the input configuration, i.e., those having homologous dimensions.

[0028] In detail,

[0029] given the memory 10 of dimension M×N, in which it may be possible to exclude from the computation a number n of non-consecutive lines (columns);

[0030] given the base I (dictionary) of storable configurations (words) of different length (but ≦ N), also comprising different types of data;

[0031] the base I is divided into a number S of classes, each containing configurations having the same maximum length;

[0032] indicating by max(j) the maximum length of the configurations contained in the class j, plus an arbitrary number of additional elements (such as the frequency of the configuration (word), expressed as a codified number), whenever the following inequality is met:

max(1)+max(2)+ . . . +max(j−1)+max(j)≦N

[0033] for j ≦ S and t ≦ S. This configuration excludes at most a limited number of elements of the base I,

[0034] the memory is organized in such a way that each line of memory comprises a plurality (j) of groups of memory locations, each group of locations of a line being intended for the storage of a configuration (word), wherein adjacent groups of memory locations of one and the same line (e.g., row) store different configurations (words) of different maximum length, whilst groups of memory locations belonging to different lines but adjacent in the direction perpendicular to the memory lines (e.g., along the columns) store configurations (words) belonging to one and the same class (having the same maximum length).
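
The organization just described can be sketched as an assignment of consecutive column ranges to classes, stopping once the sum of the maximum lengths would exceed N. The function name, the use of half-open `(start, end)` column ranges and the example lengths are illustrative assumptions; the actual subdivision used in the embodiment is the one given in the table of FIG. 4.

```python
def column_layout(class_max_lengths, n_columns):
    """Assign consecutive column ranges to classes 1, 2, ... while
    max(1) + max(2) + ... + max(j) stays within n_columns.

    Returns a dict: class number -> (first column, one past last column).
    """
    layout = {}
    start = 0
    for class_id, max_len in enumerate(class_max_lengths, start=1):
        if start + max_len > n_columns:
            # This class (and any following one) does not fit in a row.
            break
        layout[class_id] = (start, start + max_len)
        start += max_len
    return layout


layout = column_layout([3, 5, 8], n_columns=16)
```

Every row then holds one word per class, side by side, and all words of a given class sit in the same group of columns across the rows.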

[0035] An example of organization of the memory 10 in the case in which the configurations (words) are stored in rows is shown in the table of FIG. 3, in which the columns of the memory 10 are re-grouped into groups of columns each associated with a different class of the base I (and the number of columns of each group is equal to the maximum length of the configurations belonging to the respective class) and the configurations (words) belonging to one and the same class are stored in different rows of the respective group of columns.

[0036] Given this organization, for the word recognition device 1 of FIG. 1, by considering a dictionary I of approx. 25,000 words of different length, taking into account that the frequency of the words in a text decreases as the length of the words increases and that words of length greater than 24 characters represent 0.4% of the total, it is possible to sub-divide the memory 10 in the manner illustrated in detail in the table of FIG. 4, which represents the content of the table 26 stored in the data memory 4.

[0037] The organization described above enables 90% occupation of the memory to be obtained with only 0.4% of words not stored.

[0038] With this type of organization, word recognition takes place by comparing the word supplied to the inputs 25 of the memory 10 with the words stored in the corresponding group of columns.
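
A search restricted to one group of columns might look like the following sketch. `TABLE_26` is a hypothetical stand-in for table 26 with made-up column ranges, the memory is modelled as a list of rows of numeric weights, and only the first `len(word)` columns of the selected group are compared (an assumption about how shorter words are handled). A word longer than every class raises `ValueError`, since no group can hold it.

```python
# Hypothetical stand-in for table 26: class maximum length -> column range.
TABLE_26 = {3: (0, 3), 5: (3, 8)}


def search(memory, word, table=TABLE_26):
    """Return (row address, distance) of the stored word closest to
    `word`, looking only at the column group matching its length."""
    # Smallest class whose maximum length can hold the word.
    max_len = min(m for m in table if m >= len(word))
    start, _ = table[max_len]
    best = None
    for r, row in enumerate(memory):
        segment = row[start:start + len(word)]
        d = sum(abs(a - b) for a, b in zip(segment, word))
        if best is None or d < best[1]:
            best = (r, d)
    return best


# Two rows, each holding a 3-element word and a 5-element word.
memory = [
    [3, 1, 20, 8, 15, 21, 19, 5],
    [4, 15, 7, 13, 15, 21, 19, 5],
]
short_hit = search(memory, [4, 15, 8])
long_hit = search(memory, [13, 15, 21, 19, 5])
```

Because only the relevant columns enter the computation, words of other lengths stored on the same rows do not affect the result.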

[0039] The organization described above enables, among other things, different types of data to be loaded onto the same row, associating them with the classes organized by columns and then selecting the columns necessary for the calculation on the basis of the data required. For example, as an alternative to that shown in the table of FIG. 4, in which the memory stores only complete words, it is possible to store in the same memory in part the weights used for the recognition of the individual characters and in part the weights used for the recognition of the words, thus using a single memory both for recognition of characters (OCR) and for recognition of words.

[0040] The advantages that can be obtained with the memory organization described are as follows. The optimization of the occupation that can be obtained enables the size of the memory to be reduced for a given application, with the stored configurations being the same, or the number of storable configurations to be increased. It may be used for the storage of data of different type, by assigning, for example, a first type of datum to be processed to a sub-group of columns and a second, different, type of datum to another sub-group, permitting a flexible use of the memory. Furthermore, the data remain easily accessible without introducing circuit or logic complexities in the overall memory device for the extraction of the data or their computation.

[0041] Finally it will be clear that numerous modifications and variants, all of which come within the scope of the inventive concept, may be introduced to the memory described and illustrated here. In particular, it is emphasized that the application in a word recognition device, as described, is purely by way of illustration. The memory 10 may be either of the auto-associative type (the type of datum output is equal to the type of datum input) or of the hetero-associative type (the type of datum output is different from that input, for example only the address of the line satisfying a given computation is output).

1. A memory device including an associative memory for the storage of data belonging to a plurality of classes, said associative memory comprising a plurality of memory locations aligned along a first and a second direction for the storage of the data along lines of memory extending along said first direction, each line of memory of said associative memory comprising a plurality of groups of memory locations, each group of locations of a line storing a respective datum, wherein groups of memory locations adjacent in said first direction and belonging to one and the same line store data belonging to different classes, and groups of memory locations adjacent in said second direction and belonging to different lines store data belonging to one and the same class.
2. A device as claimed in claim 1 wherein each of said classes comprises data having a same maximum length and adjacent groups of locations of a line store data having different maximum lengths.
3. A device as claimed in claim 1 wherein said data comprise sequences of values codifying characters forming words of a dictionary.
4. A device as claimed in claim 1 wherein said lines comprise rows of memory and the data belonging to one and the same class are stored in one and the same group of columns.
5. A device as claimed in claim 6, comprising selectively enabling means for said groups of memory locations which are adjacent in said second direction.
6. A device as claimed in claim 6, comprising storage means for storing a correspondence between said classes of data and the addresses of said groups of memory locations.
7. A device as claimed in claim 1 wherein said associative memory comprises a flash memory for storing analog signals.
8. A device as claimed in claim 1 wherein said associative memory comprises: a first group of sections for storing data comprising analog weights used for recognition of words; and a second group of sections for storing data comprising analog weights used for recognition of characters, wherein said first and second sections are mutually exclusive.
9. A method comprising: parsing input data to provide a portion delimited by predetermined characteristics; determining a length of said portion; comparing said length to a table of lengths of data stored in an associative analog memory; providing said portion to a section of said associative analog memory containing data having lengths comparable to said portion; identifying a closest match between a datum stored in said section and said portion; and storing an address from said associative analog memory corresponding to said datum in a second memory.
10. A method as claimed in claim 9 wherein said step of storing an address from said associative analog memory comprises a step of storing a binary address in a digital memory, wherein said binary address is from said associative analog memory and corresponds to said datum.
11. A method as claimed in claim 9 wherein said step of parsing input data includes a step of parsing input data comprising a sequence of analog values to provide a portion delimited by predetermined characteristics.
12. A method as claimed in claim 9 wherein said step of parsing input data includes a step of parsing input data comprising sequences of analog values individually representing alphanumeric characters to provide a portion delimited by one or more elements chosen from a group consisting of: spaces, commas, periods, question marks, exclamation points, parentheses, brackets, colons, semicolons and quotation marks.
13. A method as claimed in claim 9 wherein said step of identifying a closest match includes steps of: calculating a Manhattan distance between said portion and each datum stored in said section; determining a least Manhattan distance between said portion and at least one datum stored in said section; and providing binary address data for said at least one datum; and wherein: said step of storing an address includes a step of storing said binary address from said associative analog memory corresponding to said datum in a digital memory.
14. A method as claimed in claim 13, further comprising steps of: providing said binary address data to a priority code generation device to derive priority code data corresponding to said binary address data; and storing one binary address of said binary address data in a digital memory based on said priority code data.
15. A method as claimed in claim 9 wherein said step of parsing input data includes a step of parsing input data comprising a sequence of analog values into either a first group representing alphanumeric data or a second group representing character recognition data.
16. A method as claimed in claim 9 wherein said step of identifying a closest match includes steps of: supplying output signals from said associative analog memory to a winner-take-all circuit; and determining a least Manhattan distance between said portion and at least one datum from said section.
17. A method comprising: parsing a dataset comprising sequences of analog values representing words to determine a length associated with each word of said dataset; collecting sequences of analog values representing a word from said dataset having comparable lengths into groups; and writing each group of said groups of data to a separate section of an associative analog memory.
18. A method as claimed in claim 17 wherein said step of parsing a dataset includes steps of: parsing a first dataset into a first ensemble representing weights used for recognizing characters; and parsing a second dataset into a second ensemble representing weights used for recognizing words.
19. A method as claimed in claim 18 wherein said step of writing each group of data includes steps of: writing data comprising said first ensemble to a first area of said associative analog memory; and writing data comprising said second ensemble to a second area of said associative analog memory, wherein said first and second areas are mutually exclusive.